DETAILED ACTION



Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

2. Claims 1-5, 14, 15, and 25-30 are rejected under 35 U.S.C. 102(e) as being
anticipated by Pitman et al., (US PAP 2002/0143530).

As per claim 1, Pitman et al., teach a feature quantity extracting apparatus comprising:
a frequency transforming section for performing a frequency transform on a signal 
portion corresponding to a prescribed time length, which is contained in an inputted 
audio signal, to derive a frequency spectrum from the signal portion ("the audio signal is 
sampled and a frequency transform is performed on a succession of set of samples"; 
Abstract, lines 1-3; paragraph 30, lines 5-7);

a band extracting section for extracting a plurality of frequency bands from the 
frequency spectrum derived by the frequency transforming section and for outputting 
band spectra which are respective frequency spectra of the extracted frequency bands 
("frequency bands"; Abstract, lines 5, and 6; paragraph 32, line 11); and 

a feature quantity calculating section for calculating respective prescribed
feature quantities of the band spectra, the feature quantity calculating section obtaining 
the calculated prescribed feature quantities as feature quantities of the audio signal 
("extract features from unknown audio content"; paragraph 33; paragraph 54). 
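
For illustration only, the three sections recited in claim 1 can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the implementation of Pitman et al. or of the application: the function names, the naive DFT, the equal-width linear band split, and the choice of a root-mean-square value as the per-band feature are all hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Frequency transform: naive DFT magnitudes for bins 0..N/2-1 (O(N^2), illustrative)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def split_bands(spectrum, num_bands):
    """Band extraction: equal-width bands on a linear frequency axis (assumed layout)."""
    width = len(spectrum) // num_bands
    return [spectrum[i * width:(i + 1) * width] for i in range(num_bands)]


def band_rms(band):
    """Feature calculation: root-mean-square of one band spectrum (assumed feature)."""
    return math.sqrt(sum(v * v for v in band) / len(band))


def extract_features(frame, num_bands=4):
    """Return one feature value per extracted band for a single signal portion."""
    bands = split_bands(magnitude_spectrum(frame), num_bands)
    return [band_rms(b) for b in bands]


# Hypothetical input: a 64-sample frame holding a sinusoid centred on DFT bin 5.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = extract_features(frame)
```

In this sketch the sinusoid falls in the lowest of the four bands, so the first feature value dominates the others.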

As per claim 2, Pitman et al., further disclose that the band extracting section 
extracts the plurality of frequency bands obtained by dividing the frequency spectrum, 
which has been derived by the frequency transforming section, at uniform intervals on a 
linear scale of a frequency axis ("sampling an audio signal at a rate of 22050 Hz... each 
of which has a duration of 2/21.5 seconds and includes 2048 samples"; paragraph 30,
lines 18-22). 

As per claim 3, Pitman et al., further disclose that the band extracting section 
extracts the plurality of frequency bands obtained by dividing the frequency spectrum, 
which has been derived by the frequency transforming section, at uniform intervals on a 
logarithmic scale of a frequency axis ("logarithmic scale"; paragraph 32). 

As per claim 4, Pitman et al., further disclose that the band extracting section 
extracts only frequency bands within a prescribed frequency range from the frequency 
spectrum derived by the frequency transforming section ("a best match can be reported 
for each predetermined interval"; paragraph 54). 

As per claim 5, Pitman et al., further disclose that the band extracting section
extracts frequency bands so as to generate a prescribed space between adjacent
frequency bands extracted ("equally spaced"; paragraph 32; paragraph 52, lines 11-13).

As per claim 14, Pitman et al., further disclose that the feature quantity 
calculating section calculates, as the prescribed feature quantities, effective values of 
respective frequency spectra of the frequency bands ("running average is taken of each 
semitone frequency band"; Abstract, lines 6, and 7). 

As per claim 15, Pitman et al., further disclose that the frequency transforming
section extracts from the audio signal the signal portion corresponding to a prescribed
time length at prescribed time intervals ("the audio signal is sampled and a frequency
transform is performed on a succession of set of samples"; Abstract, lines 1-3;
paragraph 30, lines 5-7), and

wherein the feature quantity calculating section includes: an effective value 
calculating section for calculating effective values of respective frequency spectra of the 
band spectra ("running average is taken of each semitone frequency band"; Abstract, 
lines 6, and 7); and 

an effective value time variation calculating section for calculating, as the 
prescribed feature quantities, numerical values related to respective time variation 
quantities of the effective values calculated by the effective value calculating section ("a
sequence of numerical values of frequency"; paragraph 49, lines 1-5).

As per claim 25, Pitman et al., teach a feature quantity extracting apparatus 
comprising: a frequency transforming section for performing a frequency transform on a 
signal portion corresponding to a prescribed time length, which is contained in an 
inputted audio signal, to derive frequency spectra from the signal portion ("the audio 
signal is sampled and a frequency transform is performed on a succession of set of 
samples"; Abstract, lines 1 -3; paragraph 30, lines 5 - 7); 

an envelope curve deriving section for deriving envelope signals which
represent envelope curves of the frequency spectra derived by the frequency
transforming section ("spectrum information"; paragraph 32, lines 8-11); and

a feature quantity calculating section for calculating, as feature quantities of the 
audio signal, numerical values related to respective extremums of the envelope signals 
derived by the envelope curve deriving section ("local maximum or minimum"; 
paragraph 35, lines 1 - 3). 

As per claim 26, Pitman et al., further disclose that the feature quantity 
calculating section obtains, as the feature quantities of the audio signal, extremum 
frequencies each being a frequency corresponding to one of the extremums of the 
envelope signals derived by the envelope curve deriving section ("an extremum in a 
semitone frequency channel"; paragraph 35, lines 1 - 3). 

As per claim 27, Pitman et al., further disclose that the feature quantity
calculating section includes: an extremum frequency calculating section for calculating 
the extremum frequencies each being a frequency corresponding to one of the 
extremums of the envelope signals derived by the envelope curve deriving section ("an 
extremum in a semitone frequency channel"; paragraph 35, lines 1 - 3); and 

a space calculating section for calculating spaces between adjacent extremum 
frequencies as the feature quantities of the audio signal ("evenly spaced frequency 
band"; paragraph 32). 
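
For illustration only, the extremum-frequency and space calculations recited in claims 25 through 27 can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Pitman et al. or of the application: the envelope is taken as an already-derived list of values, the strict-inequality test for an extremum and the `bin_hz` frequency step are hypothetical choices, and all function names are invented for the example.

```python
def local_extrema(envelope):
    """Indices where the envelope is a strict local maximum or minimum."""
    return [
        i for i in range(1, len(envelope) - 1)
        if (envelope[i] > envelope[i - 1] and envelope[i] > envelope[i + 1])
        or (envelope[i] < envelope[i - 1] and envelope[i] < envelope[i + 1])
    ]


def extremum_frequencies(envelope, bin_hz):
    """Frequencies of the extrema, assuming bin i sits at i * bin_hz."""
    return [i * bin_hz for i in local_extrema(envelope)]


def spaces(freqs):
    """Spaces between adjacent extremum frequencies."""
    return [b - a for a, b in zip(freqs, freqs[1:])]


# Hypothetical envelope with maxima at bins 2 and 6 and a minimum at bin 4.
envelope = [0, 1, 3, 1, 0, 2, 5, 2, 0]
freqs = extremum_frequencies(envelope, bin_hz=10.0)
gaps = spaces(freqs)
```

Dividing each space by a reference value, such as the lowest extremum frequency, would give the ratio form recited in claims 28 and 29.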

As per claim 28, Pitman et al., further disclose that the space calculating section 
obtains, as the feature quantities of the audio signal, numerical values which represent 
a space as a ratio to a prescribed reference value ("equally spaced on a logarithmic 
scale"; paragraph 32, lines 1-4). 

As per claims 29, and 30, Pitman et al., further disclose that the space 
calculating section obtains, as the prescribed reference value, the lowest of the 
extremum frequencies; a value of difference between the lowest and the second lowest 
of the extremum frequencies ("evenly spaced frequency band"; paragraph 35, lines 1 - 
4; paragraph 32). 

Claim Rejections - 35 USC § 103

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 6-13, 16-24, and 31-34 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Pitman et al., (US PAP 2002/0143530) in view of Ellis et al., (US
Patent 5,504,518).

As per claim 6, Pitman et al., do not specifically teach that the feature quantity 
calculating section calculates peak values corresponding to values at respective peaks 
of the band spectra, and obtains, as the prescribed feature quantities, values of 
difference between peak values of frequency bands. 

Ellis et al., teach that a segment recognition sub-system may detect multiple
matches on a given key signature for consecutive frames, examine the run structure in
the segment signature, and generate an anticipated peak value width therefrom (col. 45,
lines 25-31).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to generate a peak value among consecutive frames as taught 
by Ellis et al., in Pitman et al., because that would help better identify the audio content. 

As per claim 7, Ellis et al., further disclose the feature quantity calculating section 
uses binary values to represent the values of difference between peak values of 
frequency bands, the binary values indicating a sign of a corresponding one of the
values of difference ("binary value"; col. 15, lines 37-40).

As per claim 8, Pitman et al., do not specifically teach that the feature quantity 
calculating section calculates peak frequencies corresponding to frequencies at 
respective peaks of the band spectra, and obtains, as the prescribed feature quantities, 
numerical values related to the calculated peak frequencies. 

Ellis et al., teach that a segment recognition sub-system may detect multiple
matches on a given key signature for consecutive frames, examine the run structure in
the segment signature, and generate an anticipated peak value width therefrom (col. 45,
lines 25-31).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to generate a peak value among consecutive frames as taught 
by Ellis et al., in Pitman et al., because that would help better identify the audio content. 

As per claim 9, Ellis et al., further disclose that the feature quantity calculating 
section calculates, as the prescribed feature quantities, values of difference between 
peak frequencies of frequency bands ("detect multiple matches on a given key signature 
for consecutive frames, and generate an anticipated peak value"; col. 45, lines 25-31).

As per claim 10, Ellis et al., further disclose that the feature quantity calculating 
section represents the prescribed feature quantities using binary values indicating 
whether a corresponding one of the values of difference between peak frequencies of
frequency bands is greater than a prescribed value ("one binary value for positive
elements"; col. 15, lines 37-40).

As per claim 11, Pitman et al., further disclose that the frequency transforming
section extracts from the audio signal the signal portion corresponding to a prescribed
time length at prescribed time intervals ("the audio signal is sampled and a frequency
transform is performed on a succession of set of samples"; Abstract, lines 1-3;
paragraph 30, lines 5-7).

However, Pitman et al., do not specifically teach a peak frequency calculating 
section for calculating peak frequencies corresponding to frequencies at respective 
peaks of the band spectra; and a peak frequency time variation calculating section for 
calculating, as the prescribed feature quantities, numerical values related to respective 
time variation quantities of the peak frequencies calculated by the peak frequency 
calculating section. 

Ellis et al., teach that a segment recognition sub-system may detect multiple
matches on a given key signature for consecutive frames, examine the run structure in
the segment signature, and generate an anticipated peak value width therefrom (col. 45,
lines 25-31).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to generate a peak value among consecutive frames as taught 
by Ellis et al., in Pitman et al., because that would help better identify the audio content. 

As per claim 12, Ellis et al., further disclose that the peak frequency time
variation calculating section obtains, as the prescribed feature quantities, binary values
indicating a sign of a corresponding one of the time variation quantities of the peak
frequencies ("binary value"; col. 15, lines 37-40).

As per claim 13, Ellis et al., further disclose that the peak frequency time 
variation calculating section obtains, as the prescribed feature quantities, binary values 
indicating whether a corresponding one of the time variation quantities of the peak 
frequencies is greater than a prescribed value ("one binary value for positive elements"; 
col. 15, lines 37-40).

As per claim 16, Pitman et al., do not specifically teach binary values
indicating a sign of a corresponding one of the time variation quantities of the effective
values.

Ellis et al., teach that positive elements of the vector are assigned one binary 
value, while negative elements are assigned the other binary value (col.15, lines 37 - 
41). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to use binary values as taught by Ellis et al., in Pitman et al., 
because that would help determine when the peak value is either positive or negative. 

As per claim 17, Pitman et al., do not specifically teach that the effective value
time variation calculating section obtains, as the prescribed feature quantities, binary 
values indicating whether a corresponding one of the time variation quantities of the 
effective values is greater than a prescribed value. 

Ellis et al., teach that positive elements of the vector are assigned one binary 
value, while negative elements are assigned the other binary value (col. 15, lines 37 - 
41). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to use binary values as taught by Ellis et al., in Pitman et al., 
because that would help determine when the peak value is either positive or negative. 
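
For illustration only, the binary features discussed for claims 16 and 17 can be sketched as a frame-to-frame difference followed by a one-bit decision. This is a minimal sketch under stated assumptions, not the encoding of Ellis et al. or of the application: the function names are hypothetical, and mapping a zero difference to the bit 0 is an arbitrary choice made for the example.

```python
def time_variation(values):
    """Frame-to-frame differences of a per-band value (e.g. an effective value)."""
    return [b - a for a, b in zip(values, values[1:])]


def sign_bits(deltas):
    """One bit per difference: 1 if positive, else 0 (zero mapped to 0 by assumption)."""
    return [1 if d > 0 else 0 for d in deltas]


def threshold_bits(deltas, threshold):
    """One bit per difference: 1 if it exceeds the prescribed value, else 0."""
    return [1 if d > threshold else 0 for d in deltas]


# Hypothetical effective values of one band over five consecutive frames.
rms_per_frame = [1.0, 1.5, 1.2, 1.2, 2.0]
deltas = time_variation(rms_per_frame)
signs = sign_bits(deltas)
above = threshold_bits(deltas, threshold=0.6)
```

The sign form corresponds to the language of claim 16 and the threshold form to that of claim 17; either reduces each time variation quantity to a single bit.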

As per claim 18, Pitman et al., further disclose that the frequency transforming 
section extracts from the audio signal the signal portion corresponding to a prescribed 
time length at prescribed time intervals ("the audio signal is sampled and a frequency 
transform is performed on a succession of set of samples"; Abstract, lines 1 -3; 
paragraph 30, lines 5-7). 

However, Pitman et al., do not specifically teach that the feature quantity calculating section calculates a
cross-correlation value between a frequency spectrum of a frequency band extracted by 
the band extracting section and another frequency spectrum on the same frequency 
band in a signal portion different from the signal portion from which the frequency band 
extracted by the band extracting section is obtained, the cross-correlation value being 
calculated for each frequency band extracted by the band extracting section, and the 
feature quantity calculating section using, as the feature quantities, numerical values
related to the cross-correlation values. 

Ellis et al., teach producing signatures characterizing respective intervals of a 
broadcast signal exhibiting correlation between at least some of said respective 
intervals for use in broadcast segment recognition. The correlator performs the 
requested matching operation and supplies the match results, along with the relevant 
information such as corresponding error count (col. 5, lines 16-19; col. 11, lines 8-11).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to determine correlation between some intervals as taught by 
Ellis et al., in Pitman et al., because that would help better identify the audio content by 
supplying matching results to the segment recognition. 
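
For illustration only, the per-band cross-correlation discussed for claim 18 can be sketched as below. This is a minimal sketch under stated assumptions, not the correlator of Ellis et al. or the calculation of the application: the claim language does not fix a particular definition, so a zero-lag normalized correlation is assumed here, and the function names are hypothetical.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-lag normalized correlation of two equal-length band spectra (assumed definition)."""
    if len(a) != len(b):
        raise ValueError("band spectra must have the same length")
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def per_band_correlation(bands_t0, bands_t1):
    """Correlate each band of one signal portion with the same band of another portion."""
    return [normalized_cross_correlation(a, b) for a, b in zip(bands_t0, bands_t1)]


# Hypothetical band spectra from two different signal portions.
bands_t0 = [[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]]
bands_t1 = [[2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]
corr = per_band_correlation(bands_t0, bands_t1)
```

In this example the first band has the same spectral shape in both portions and yields a value of 1, while the second band shares no energy between portions and yields 0.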

As per claim 19, Pitman et al., further disclose binary values indicating a sign
of a corresponding one of the cross-correlation values (col. 15, lines 37-41).

As per claim 20, Pitman et al., further disclose that the feature quantity 
calculating section calculates, as the prescribed feature quantities, numerical values 
related to respective time variation quantities of the calculated cross-correlation values 
("numerical values of frequency"; paragraph 49, lines 1 - 3). 

As per claim 21, Pitman et al., teach a feature quantity extracting apparatus
comprising: a signal extracting section for extracting from an extracted audio signal a
plurality of signal portions each corresponding to a prescribed time length ("extract
features from unknown audio content"; paragraph 54, lines 1-10);

However Pitman et al., do not specifically teach calculating a cross-correlation 
value between one of the plurality of signal portions extracted by the signal extracting 
section and another of the plurality of signal portions, the feature quantity calculating 
section obtaining a numerical value related to the calculated cross-correlation value as 
a feature quantity of the audio signal. 

Ellis et al., teach producing signatures characterizing respective intervals of a 
broadcast signal exhibiting correlation between at least some of said respective 
intervals for use in broadcast segment recognition. The correlator performs the 
requested matching operation and supplies the match results, along with the relevant 
information such as corresponding error count (col. 5, lines 16-19; col. 11, lines 8-11).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to determine correlation between some intervals as taught by 
Ellis et al., in Pitman et al., because that would help better identify the audio content by 
supplying matching results to the segment recognition. 

As per claim 22, Ellis et al., further disclose that the feature quantity calculating
section obtains the cross-correlation value as the feature quantity of the audio signal
("supplies a match report for each audio"; col. 11, lines 8-13).

As per claim 23, Ellis et al., further disclose that the feature quantity calculating
section obtains a binary value as the feature quantity of the audio signal, the binary 
value indicating a sign of the cross-correlation value ("binary value"; col. 15, lines 37 - 
40). 

As per claim 24, Ellis et al., further disclose that the signal extracting section 
extracts the signal portions at prescribed time intervals, and wherein the feature quantity 
calculating section includes: a cross-correlation value calculating section for calculating 
the cross-correlation value at the prescribed time intervals; and a cross-correlation 
value time variation calculating section for calculating a time variation quantity of the 
cross-correlation value as the feature quantity of the audio signal ("correlation between
at least some of said respective intervals for use in broadcast segment recognition; and
supplies a match report for each audio"; col. 11, lines 8-13).

As per claims 31-33, Pitman et al., teach a recording medium; and a feature
quantity storage section which stores at least a set of a feature quantity of an audio
signal and control instruction information associated therewith (paragraph 54;
paragraph 25).

However Pitman et al., do not specifically teach receiving television program data 
containing an audio signal and a video signal, and is capable of recording the television 
program data to a recording medium, wherein the feature quantity extracting apparatus 
obtains a feature quantity of the audio signal contained in the television program data, 
wherein the program recording apparatus further comprises: a recording control section
for controlling recording of the television program data to the recording medium; the 
audio signal containing music played in a television program to be recorded, the control 
instruction information instructing the recording control section to perform or stop 
recording of the television program; a feature quantity comparison section for 
determining whether the audio signal contained in the television program data matches 
with the audio signal containing the music played in the television program based on 
both the feature quantity obtained by the feature quantity extracting apparatus and the 
feature quantity stored in the feature quantity storage section, and wherein when the 
feature quantity comparison section determines that the audio signal contained in the 
television program data matches with the audio signal containing the music played in 
the television program, the recording control section performs the control of performing 
or stopping recording of the television program data to the recording medium in 
accordance with an instruction indicated by control instruction information which is 
stored in the feature quantity storage section and associated with a feature quantity of 
the audio signal having been determined as matching with the audio signal containing 
the music played in the television program. 

Ellis et al., teach receiving television broadcast signals over a respective channel
and demodulating the received signals to provide baseband video and audio signals.
The video and audio signals are thereafter supplied to the segment recognition
subsystem, wherein frame signatures for each of the video and audio signals are
generated and thereafter compared to stored key signatures to determine if a
match exists (col. 9, lines 55-62). The FIR module serves to improve signature stability
by averaging the audio spectral data over a number of television frames, thus
enhancing the likelihood of obtaining correct signature matches (col. 21, lines 64-67).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
the invention was made to match audio in a television broadcast as taught by Ellis et al., 
in Pitman et al., because that would help verify whether video and audio signals are 
synchronized during television broadcasting. 

As per claim 34, Ellis et al., further disclose that the program reproduction control 
apparatus further comprises an editing section capable of editing the television program 
data recorded in the recording medium ("updating a broadcast segment recognition
database storing signatures"; col. 5, lines 2 and 3).

Conclusion 

5. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Obrador et al., (US Patent 7,184,955) teach a system and method for indexing 
video based on speaker distinction. 

Kenyon et al., (US Patent 4,450,531) teach a broadcast signal recognition system and
method.

Lamb et al., (US Patent 5,437,050) teach a method and apparatus for 
recognizing broadcast information using multi-frequency magnitude detection. 



Application/Control Number: 10/667,465 
Art Unit: 2626 



6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Leonard Saint-Cyr whose telephone number is (571)
272-4247. The examiner can normally be reached Monday through Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 